Add Column
Add Column is your Swiss Army knife for creating new data in your dataset. Whether you want to combine existing columns, perform calculations, or transform your data in any way, Add Column is where the magic happens!
What Can You Do With Add Column?
Combine Information π
- Join first and last names into a full name
- Merge address fields into a complete address
- Concatenate any text fields together
Perform Calculations π’
- Calculate totals (like
price * quantity
) - Work out percentages (like
(score / total) * 100
) - Convert units (like kilometers to miles)
Create Categories π·οΈ
- Group customers by spending levels
- Classify products by price range
- Flag high-priority items
Clean Up Data π§Ή
- Remove unwanted spaces from text
- Fix inconsistent formatting
- Standardize values
Add Business Logic π‘
- Calculate discounts based on conditions
- Apply different tax rates
- Set up complex classification rules
𧱠Basic Building Blocks
Literals
These are your raw values used for direct comparisons, calculations, and thresholds. Use them when you need fixed values - like filtering orders above 1000
, applying a 0.15
tax rate, or checking if status equals "active"
.
Type | Technical Name | Examples | Common Uses |
---|---|---|---|
Integers | Whole Numbers | 123 , -456 , 1000 |
Order quantities, Counting, Exact thresholds |
Floating Point | Decimal Numbers | 123.45 , -0.5 , 0.15 |
Prices, Percentages, Precise measurements |
Strings | Text Values | "hello world" , "active" , "SKU-123" |
Status labels, Names, IDs |
Booleans | True/False Values | true , false |
Flags, Status checks, Logical operations |
Null | Empty/Missing Value | null |
Missing data, Default states, Optional values |
Variables/Columns
Reference your dataset columns or create aliases for cleaner expressions. Especially useful when you need to refer to the same column multiple times or want to make your expressions more readable. Column bindings help simplify complex transformations.
- Standard identifiers:
column_name
,my_variable
- Column bindings:
a * 2
- Template syntax:
${column_name}
- This is useful when you want to be more specific in referencing a column, such as
${revenue} * ${quantity}
for two numbers, or to concatenate two columns together regardless of what the datatype is, such as `${price} + " " + ${quantity}.
- This is useful when you want to be more specific in referencing a column, such as
Note: Currently, when referencing columns by name, spaces and special cases are ignored. E.g. "First Name" should be referenced as "firstname" in a formula.
Operators
Arithmetic Operators
- Addition:
a + b
- Subtraction:
a - b
- Multiplication:
a * b
- Division:
a / b
- Negation:
-a
Comparison Operators
- Equal:
a == b
- Not Equal:
a != b
- Less Than:
a < b
- Less Than or Equal:
a <= b
- Greater Than:
a > b
- Greater Than or Equal:
a >= b
Logical Operators
- AND:
a and b
- OR:
a or b
- NOT:
!a
Data Structures
Arrays
Arrays are ordered lists of values. In this system they can contain different types of values in the same array.
- Array literal:
[1, 2, 3]
- Array indexing:
arr[0]
- Can contain mixed types: `[1, "text", true]
Structs/Objects
Structs are collections of named fields that can contain different types of values. They're useful for grouping related data together.
- Struct literal:
{ name: "John", age: 30 }
- Can be created through {name: name, age: age}
- This is useful for conducting joins and useful aggregations.
- Nested structures:
{ user: { name: "John", details: { age: 30 } } }
Functions
Built-in Functions
str(value)
: Convert value to stringtype(value)
: Get type of valuelen(value)
: Length of string or arrayisnull(value)
: Check if value is nulltrim(string)
: Remove whitespace from string
Complex Expressions
Function Calls
len("hello world") = 11
trim(" hello ") = hello
str(123) = 123
Nested Expressions
(10 * (5 + 3)) / 2 = 40
array[indexOf(items, "search")]
Conditional Logic
age >= 18 and (status == "active" or role == "admin")
!isnull(value) and value > 0
If Statements
You can use if statements for conditional logic using this syntax:
if condition then value1 else value2
For example:
if ${price} > 100 then "expensive" else "cheap"
if ${age} >= 18 then "adult" else "minor"
You can combine conditions using logical operators:
if ${age} >= 18 and ${income} > 50000 then "qualified" else "not qualified"
Window Functions
Window functions allow you to access data from other rows relative to the current row:
lag(column)
: Gets the value from the previous row for that column
lead(column)
: Gets the value from the next row for that column
For example:
lag(${price}) // Gets previous row's price
lead(${quantity}) // Gets next row's quantity
Column Bindings
You can create column bindings at the start of an expression using semicolon syntax:
a=column1, b=column2; (a * 2) + b
Type Coercion
- Numbers in binary operations are automatically coerced to the appropriate type
- String concatenation is performed with
+
when both operands are strings - Explicit type conversion is available through the
str()
function
Error Handling
The parser will provide error messages for:
- Invalid syntax
- Unknown variables/columns
- Type mismatches
- Invalid function calls
- Duplicate field names in structs
- Invalid array indices
- Unterminated strings or identifiers
Examples
Basic Arithmetic
price * quantity * (1 - discount)
String Operations
trim(first_name) + " " + last_name
Array Operations
values[0] + values[len(values) - 1]
Complex Conditions
amount > 1000 and (
status == "approved" or
(role == "admin" and override == true)
)
Column Transformations
a=original_column;
{
raw: a,
normalized: a / 100,
category: str(a) + "_category"
}
Best Practices
- Use parentheses to make complex expressions more readable and to ensure correct operator precedence
- Use column bindings for clarity when referencing the same column multiple times
- Break complex expressions into smaller parts using struct literals
- Use meaningful names for column bindings
- Prefer explicit type conversions over implicit coercion for clarity
Implementation Notes
- All expressions maintain their source location information for error reporting
- The formula box is case-sensitive
- Whitespace is ignored except within string literals
- Comments are supported using
// comment
syntax - Columns will evaluate in real time - if it works, you should see the result.
π Pro Tips
- Use parentheses to make complex calculations clearer:
(price + tax) * quantity
- Break down complex formulas into smaller parts using column bindings
- When in doubt, use the ${column_name} syntax for column names
- Test your formulas with small examples first
- Use comments to explain complex calculations:
// Calculate price after tax
β Common Mistakes to Avoid
- Forgetting quotes around text: Use
status == "active"
, notstatus == active
- Mixing up
=
and==
: Use==
for comparing,=
for column bindings - Not handling null values: Check with
isnull()
when needed - Case sensitivity:
"Hello"
and"hello"
are different - Forgetting to wrap column names with spaces in
${}
π Troubleshooting
If your formula isn't working:
- Check for matching quotes and parentheses
- Verify column names match exactly
- Make sure you're using the right data types
- Look for typos in function names
- Use the
type()
function to check what kind of data you're working with
Remember: The preview will update in real-time, so you can see immediately if your formula works!